-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix infinite loop during open-ended range query -- CouchDB #1347
Conversation
We use a single scanner for achieving both paginated and non-paginated range query. We have internalQueryLimit and pageSize. For each _all_docs?startKey="XXX"&endKey="YYY" REST API call to CouchDB, we fetch atmost internalQueryLimit number of records only by appending limit=internalQueryLimit. When the requested pageSize is higher than the internalQueryLimit or the total number of records present in the given range is higher than the internalQueryLimit, iterator would execute the query again once the records fetched in the first cycle is consumed and so on. In order to do that, after an execution of the REST API in each cycle, it updates the initially passed startKey to the nextStartKey. If there is no nextStartKey, it is set to the endKey. Currently, when the nextStartKey and the endKey are the same, we still run one REST API call which is actually not needed as we always set inclusive_end=false. However, this causes infinite loop in a particular case. When we want to retrieve all the records, we would pass an empty string as the startKey and the endKey. When the startKey is an empty string, the REST API call would become _all_docs?endKey="YYY". When both are empty, it would become _all_docs Given that we set startKey to endKey when there is no nextStartKey and still execute one REST API call, it gets into infinite loop by fetching all records again and again. We avovid this infinite loop by setting scanner.exhausted = true whe the startKey and endKey become the same when there is no nextStartKey Signed-off-by: senthil <cendhu@gmail.com>
scanner.paginationInfo.cursor = 0 | ||
if scanner.queryDefinition.endKey == nextStartKey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to make this check more explicit by having this either of the following
nextStartKey == ""
if nextStartKey >= endKey || nextStartKey == ""
In the current form, this check appears to be a special case of the (2) above and may cause confusion that why equal check only and why not greater than (imagine someone looking at this code and having a generic key (not an ""
) as the end key in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nextStartKey
would never be greater than the endKey
. The current condition is very generic. When the scanner is exhausted, nextStartKey == endKey
would always become true and any condition that we have after this condition would become a deadcode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the PR description, the current condition check avoids an additional REST API call once the scanner is exhaused. Let's assume startKey = "marble1"
and endKey = "marble5"
. It is possible that the startKey
and endKey
would become the same, i.e., marble5. Once it has become the same, there is no need to run an additional REST API call. The current condition takes care of all scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. just looked at the code and found this logic of explicitly returning the end key as next start, deep buried :-) Thanks for the explanation. I feel that the logic of limits and pagination is much spread across functions and needs a refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Manish, for merging this. I agree. It is much spread and complex to read.
* Address more concerns highlighted by linters These changes remove dead code, add error checks, and use assign unused variables to the unnamed variable `_`. Signed-off-by: Matthew Sykes <sykesmat@us.ibm.com> * Fixed write_first_app.rst typo Signed-off-by: pratikpatil024 <pratikspatil024@gmail.com> * FAB-17840 Ch.Part.API: Join channel REST handler (hyperledger#1305) * FAB-17840 Ch.Part.API: Join channel REST handler Implement a handler for a POST request to join a channel. Here we support requests carrying application/json, support for additional Content-Type will be added later. Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I8d09e3cba09842f2adc47fb60161eee814e33d31 * Review: simplify code Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: Iee1e8b66cb77f64b9762dee8f85304958081e0fe * Review comments: simplify code - extract error handling to separate method - extract channelID extraction to separate method, reuse - test Accept header allowed range - better comments Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I11639a3f159694019e521e180e7fd5fadb42cb4f * Review comments: support for multipart/form-data Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: Ic5a0307f56be5561f45910a02892dc7e7b9554d1 * Review comments: remove support for application/json Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I38bb3564a0e6482b7bf9dca25bd8424e6db2ac95 * Spelling: s/chainocde/chaincode/g Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Retire dormant Fabric maintainers Retire maintainers that have been inactive for the last 3 months: Jonathan Levi Srinivasan Muralidharan Note: "A maintainer removed for inactivity should be restored following a sustained resumption of contributions and reviews (a month or more) demonstrating a renewed commitment to the project." Signed-off-by: David Enyeart <enyeart@us.ibm.com> * add NOTICE file recommended per ASF and LF policy Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * Add NOTICE to license ignore list Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * Fix some typo in docs This CR fixes the following some typo in docs. - chainocode -> chaincode - chainode -> chaincode - lifecyle -> lifecycle - Hyperlegder -> Hyperledger - chanel -> channel - scructured -> structured - demostrate -> demonstrate - certficates -> certificates - how how -> how - the the -> the - a a -> a - thefollowing -> the following Signed-off-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> * Remove txmgr interface This commit removes the interface TxMgr, as there is a single implementation (i.e., LockbasedTxmgr). All the dependent packages are able to use this single implementation directly. The only exception is the validation package that causes a circular dependency. the validation package uses only a single function from this interface (NewTxSimulator). The validation package uses this function for getting accessing to the TxSimulator, in order to allow for the execution a post-order transaction (only channel config transaction in the current code). For meeting this need, this commit now defines a local interface with a single function in the validation package itself. Signed-off-by: manish <manish.sethi@gmail.com> * FAB-17841: Ch.Part.API: Remove channel REST handler (hyperledger#1330) Handle DELETE request to remove a channel. - Resolve the removeStorage flag from config & query. - If system-channel exists - reject with StatusMethodNotAllowed and Allow header set to GET. - If channel does not exist - reject with StatusNotFound. - If successs - respond with StatusNoContent. Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I78581df3c7f0cb99007edddc83eb7a6dca5eba07 * Update commercial paper doc to use enrollUser.js Signed-off-by: NIKHIL E GUPTA <negupta@us.ibm.com> * Use protolator from fabric-config Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Revert "Bump viper version to the last working commit" This reverts commit 5ad0a4f. Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * _lifecycle ignore previous build failure during install (hyperledger#1280) When a chaincode build previously failed while installing the chaincode, _lifecycle should ignore the cached error and attempt to rebuild the chaincode. This is because we should assume the end user knows something about why the build may succeed on retry if they're reattempting an install. Also, update integration tests to not care about exit status of chaincode installs (since reinstalls now error). FAB-17907 Signed-off-by: Will Lahti <wtlahti@us.ibm.com> * [FAB-17927] Add AllChaincodesInfo to DeployedChaincodeInfoProvider (hyperledger#1331) Implement AllChaincodesInfo to query ledger and return chaincode info for all the deployed chaincode (both new lifecycle and legacy chaincodes) on a channel. This function is needed for ledger checkpointing and deletion of channel-specific databases in statecouchdb. Signed-off-by: Wenjian Qiao <wenjianq@gmail.com> * [FAB-17471] Fix OrdererType key to correct line SampleInsecureKafka profile tries to change consensus type to kafka using OrdererType. But it was written in the line above "<<: *OrdererDefaults". That causes the overwrite consensus type again. This CR moves OrdererType to the correct line. As a result, this CR can generate the correct ordererer type's genesis block. Signed-off-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> * [FAB-17059] Add missing mspID when init MembershipProvider. Signed-off-by: Hongbin Mao <hello2mao@gmail.com> * fsync and do not generate empty files in snapshots (hyperledger#1345) This commit fixes following issues - Perform sync on the snapshot files - Does not generate empty files - Use filepath instead of path package Signed-off-by: manish <manish.sethi@gmail.com> * fix infinite loop during full range query (hyperledger#1347) We use a single scanner for achieving both paginated and non-paginated range query. We have internalQueryLimit and pageSize. For each _all_docs?startKey="XXX"&endKey="YYY" REST API call to CouchDB, we fetch atmost internalQueryLimit number of records only by appending limit=internalQueryLimit. When the requested pageSize is higher than the internalQueryLimit or the total number of records present in the given range is higher than the internalQueryLimit, iterator would execute the query again once the records fetched in the first cycle is consumed and so on. In order to do that, after an execution of the REST API in each cycle, it updates the initially passed startKey to the nextStartKey. If there is no nextStartKey, it is set to the endKey. Currently, when the nextStartKey and the endKey are the same, we still run one REST API call which is actually not needed as we always set inclusive_end=false. However, this causes infinite loop in a particular case. When we want to retrieve all the records, we would pass an empty string as the startKey and the endKey. When the startKey is an empty string, the REST API call would become _all_docs?endKey="YYY". When both are empty, it would become _all_docs Given that we set startKey to endKey when there is no nextStartKey and still execute one REST API call, it gets into infinite loop by fetching all records again and again. We avovid this infinite loop by setting scanner.exhausted = true whe the startKey and endKey become the same when there is no nextStartKey Signed-off-by: senthil <cendhu@gmail.com> * Remove unnecessary extension of osn (hyperledger#1351) In the integration test touched by this patch, cert file is read to remove and add node. It can be easily achieved by reading file bytes directly, without extending network to grab intermediate object. Signed-off-by: Jay Guo <guojiannan1101@gmail.com> * [FAB-17935] Change unnecessary warning log line to debug in gossip (hyperledger#1350) privdata Signed-off-by: Danny Cao <dcao@us.ibm.com> * [FAB-17900] Update BCCSP.PKCS11.Pin in examples - The Pin value must be quoted when specified in the yaml Signed-off-by: Tiffany Harris <tiffany.harris@ibm.com> * [FAB-17900] Fixes numeric env variable override bug Signed-off-by: Tiffany Harris <tiffany.harris@ibm.com> * Remove s390x, powerpc64le from RELEASE_PLATFORMS - Release automation only creates amd64 binaries for darwin, linux, and windows. - Continuous integration no longer runs on powerpc64le or s390x Also remove stale build tags related to plugins and race detection for old versions of go, s390x, and ppc64le. Signed-off-by: Matthew Sykes <sykesmat@us.ibm.com> * Backfill test for BCCSP environment overrides... ... and consistently use SW as the key for `SwOpts` in the configuration structures. Right now tags for mapstructure and JSON do not match the tags for YAML yet our sample configuration documents (in YAML) use `SW`. Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Updates in master for v2.1.1 release Update master doc and bootstrap script for v2.1.1 release. Signed-off-by: David Enyeart <enyeart@us.ibm.com> Co-authored-by: Matthew Sykes <sykesmat@us.ibm.com> Co-authored-by: pratikpatil024 <pratikspatil024@gmail.com> Co-authored-by: Yoav Tock <tock@il.ibm.com> Co-authored-by: Matthew Sykes <matthew.sykes@gmail.com> Co-authored-by: David Enyeart <enyeart@us.ibm.com> Co-authored-by: Christopher Ferris <chrisfer@us.ibm.com> Co-authored-by: Brett Logan <brett.t.logan@ibm.com> Co-authored-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> Co-authored-by: manish <manish.sethi@gmail.com> Co-authored-by: NIKHIL E GUPTA <negupta@us.ibm.com> Co-authored-by: Will Lahti <wtlahti@us.ibm.com> Co-authored-by: Wenjian Qiao <wenjianq@gmail.com> Co-authored-by: Hongbin Mao <hello2mao@gmail.com> Co-authored-by: Senthil Nathan N <cendhu@users.noreply.github.com> Co-authored-by: Jay Guo <guojiannan1101@gmail.com> Co-authored-by: Danny Cao <dcao@us.ibm.com> Co-authored-by: Tiffany Harris <tiffany.harris@ibm.com>
* Address more concerns highlighted by linters These changes remove dead code, add error checks, and use assign unused variables to the unnamed variable `_`. Signed-off-by: Matthew Sykes <sykesmat@us.ibm.com> * Fixed write_first_app.rst typo Signed-off-by: pratikpatil024 <pratikspatil024@gmail.com> * FAB-17840 Ch.Part.API: Join channel REST handler (hyperledger#1305) * FAB-17840 Ch.Part.API: Join channel REST handler Implement a handler for a POST request to join a channel. Here we support requests carrying application/json, support for additional Content-Type will be added later. Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I8d09e3cba09842f2adc47fb60161eee814e33d31 * Review: simplify code Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: Iee1e8b66cb77f64b9762dee8f85304958081e0fe * Review comments: simplify code - extract error handling to separate method - extract channelID extraction to separate method, reuse - test Accept header allowed range - better comments Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I11639a3f159694019e521e180e7fd5fadb42cb4f * Review comments: support for multipart/form-data Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: Ic5a0307f56be5561f45910a02892dc7e7b9554d1 * Review comments: remove support for application/json Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I38bb3564a0e6482b7bf9dca25bd8424e6db2ac95 * Spelling: s/chainocde/chaincode/g Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Retire dormant Fabric maintainers Retire maintainers that have been inactive for the last 3 months: Jonathan Levi Srinivasan Muralidharan Note: "A maintainer removed for inactivity should be restored following a sustained resumption of contributions and reviews (a month or more) demonstrating a renewed commitment to the project." Signed-off-by: David Enyeart <enyeart@us.ibm.com> * add NOTICE file recommended per ASF and LF policy Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * Add NOTICE to license ignore list Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * Fix some typo in docs This CR fixes the following some typo in docs. - chainocode -> chaincode - chainode -> chaincode - lifecyle -> lifecycle - Hyperlegder -> Hyperledger - chanel -> channel - scructured -> structured - demostrate -> demonstrate - certficates -> certificates - how how -> how - the the -> the - a a -> a - thefollowing -> the following Signed-off-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> * Remove txmgr interface This commit removes the interface TxMgr, as there is a single implementation (i.e., LockbasedTxmgr). All the dependent packages are able to use this single implementation directly. The only exception is the validation package that causes a circular dependency. the validation package uses only a single function from this interface (NewTxSimulator). The validation package uses this function for getting accessing to the TxSimulator, in order to allow for the execution a post-order transaction (only channel config transaction in the current code). For meeting this need, this commit now defines a local interface with a single function in the validation package itself. Signed-off-by: manish <manish.sethi@gmail.com> * FAB-17841: Ch.Part.API: Remove channel REST handler (hyperledger#1330) Handle DELETE request to remove a channel. - Resolve the removeStorage flag from config & query. - If system-channel exists - reject with StatusMethodNotAllowed and Allow header set to GET. - If channel does not exist - reject with StatusNotFound. - If successs - respond with StatusNoContent. Signed-off-by: Yoav Tock <tock@il.ibm.com> Change-Id: I78581df3c7f0cb99007edddc83eb7a6dca5eba07 * Update commercial paper doc to use enrollUser.js Signed-off-by: NIKHIL E GUPTA <negupta@us.ibm.com> * Use protolator from fabric-config Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Revert "Bump viper version to the last working commit" This reverts commit 5ad0a4f. Signed-off-by: Brett Logan <brett.t.logan@ibm.com> * _lifecycle ignore previous build failure during install (hyperledger#1280) When a chaincode build previously failed while installing the chaincode, _lifecycle should ignore the cached error and attempt to rebuild the chaincode. This is because we should assume the end user knows something about why the build may succeed on retry if they're reattempting an install. Also, update integration tests to not care about exit status of chaincode installs (since reinstalls now error). FAB-17907 Signed-off-by: Will Lahti <wtlahti@us.ibm.com> * [FAB-17927] Add AllChaincodesInfo to DeployedChaincodeInfoProvider (hyperledger#1331) Implement AllChaincodesInfo to query ledger and return chaincode info for all the deployed chaincode (both new lifecycle and legacy chaincodes) on a channel. This function is needed for ledger checkpointing and deletion of channel-specific databases in statecouchdb. Signed-off-by: Wenjian Qiao <wenjianq@gmail.com> * [FAB-17471] Fix OrdererType key to correct line SampleInsecureKafka profile tries to change consensus type to kafka using OrdererType. But it was written in the line above "<<: *OrdererDefaults". That causes the overwrite consensus type again. This CR moves OrdererType to the correct line. As a result, this CR can generate the correct ordererer type's genesis block. Signed-off-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> * [FAB-17059] Add missing mspID when init MembershipProvider. Signed-off-by: Hongbin Mao <hello2mao@gmail.com> * fsync and do not generate empty files in snapshots (hyperledger#1345) This commit fixes following issues - Perform sync on the snapshot files - Does not generate empty files - Use filepath instead of path package Signed-off-by: manish <manish.sethi@gmail.com> * fix infinite loop during full range query (hyperledger#1347) We use a single scanner for achieving both paginated and non-paginated range query. We have internalQueryLimit and pageSize. For each _all_docs?startKey="XXX"&endKey="YYY" REST API call to CouchDB, we fetch atmost internalQueryLimit number of records only by appending limit=internalQueryLimit. When the requested pageSize is higher than the internalQueryLimit or the total number of records present in the given range is higher than the internalQueryLimit, iterator would execute the query again once the records fetched in the first cycle is consumed and so on. In order to do that, after an execution of the REST API in each cycle, it updates the initially passed startKey to the nextStartKey. If there is no nextStartKey, it is set to the endKey. Currently, when the nextStartKey and the endKey are the same, we still run one REST API call which is actually not needed as we always set inclusive_end=false. However, this causes infinite loop in a particular case. When we want to retrieve all the records, we would pass an empty string as the startKey and the endKey. When the startKey is an empty string, the REST API call would become _all_docs?endKey="YYY". When both are empty, it would become _all_docs Given that we set startKey to endKey when there is no nextStartKey and still execute one REST API call, it gets into infinite loop by fetching all records again and again. We avovid this infinite loop by setting scanner.exhausted = true whe the startKey and endKey become the same when there is no nextStartKey Signed-off-by: senthil <cendhu@gmail.com> * Remove unnecessary extension of osn (hyperledger#1351) In the integration test touched by this patch, cert file is read to remove and add node. It can be easily achieved by reading file bytes directly, without extending network to grab intermediate object. Signed-off-by: Jay Guo <guojiannan1101@gmail.com> * [FAB-17935] Change unnecessary warning log line to debug in gossip (hyperledger#1350) privdata Signed-off-by: Danny Cao <dcao@us.ibm.com> * [FAB-17900] Update BCCSP.PKCS11.Pin in examples - The Pin value must be quoted when specified in the yaml Signed-off-by: Tiffany Harris <tiffany.harris@ibm.com> * [FAB-17900] Fixes numeric env variable override bug Signed-off-by: Tiffany Harris <tiffany.harris@ibm.com> * Remove s390x, powerpc64le from RELEASE_PLATFORMS - Release automation only creates amd64 binaries for darwin, linux, and windows. - Continuous integration no longer runs on powerpc64le or s390x Also remove stale build tags related to plugins and race detection for old versions of go, s390x, and ppc64le. Signed-off-by: Matthew Sykes <sykesmat@us.ibm.com> * Backfill test for BCCSP environment overrides... ... and consistently use SW as the key for `SwOpts` in the configuration structures. Right now tags for mapstructure and JSON do not match the tags for YAML yet our sample configuration documents (in YAML) use `SW`. Signed-off-by: Matthew Sykes <matthew.sykes@gmail.com> * Updates in master for v2.1.1 release Update master doc and bootstrap script for v2.1.1 release. Signed-off-by: David Enyeart <enyeart@us.ibm.com> Co-authored-by: Matthew Sykes <sykesmat@us.ibm.com> Co-authored-by: pratikpatil024 <pratikspatil024@gmail.com> Co-authored-by: Yoav Tock <tock@il.ibm.com> Co-authored-by: Matthew Sykes <matthew.sykes@gmail.com> Co-authored-by: David Enyeart <enyeart@us.ibm.com> Co-authored-by: Christopher Ferris <chrisfer@us.ibm.com> Co-authored-by: Brett Logan <brett.t.logan@ibm.com> Co-authored-by: Nao Nishijima <nao.nishijima.xt@hitachi.com> Co-authored-by: manish <manish.sethi@gmail.com> Co-authored-by: NIKHIL E GUPTA <negupta@us.ibm.com> Co-authored-by: Will Lahti <wtlahti@us.ibm.com> Co-authored-by: Wenjian Qiao <wenjianq@gmail.com> Co-authored-by: Hongbin Mao <hello2mao@gmail.com> Co-authored-by: Senthil Nathan N <cendhu@users.noreply.github.com> Co-authored-by: Jay Guo <guojiannan1101@gmail.com> Co-authored-by: Danny Cao <dcao@us.ibm.com> Co-authored-by: Tiffany Harris <tiffany.harris@ibm.com>
Type of change
Description
We use a single scanner for achieving both paginated and non-paginated range query.
We have
internalQueryLimit
andpageSize
. The internalQueryLimit controls the number of records fetched per REST API call. The pageSize controls the total number of records needed by the caller.A range query is done by performing
_all_docs?startKey="XXX"&endKey="YYY"
REST API call to CouchDB. The iterator fetches almostinternalQueryLimit
number of records by appendinglimit=internalQueryLimit
to the REST API.In the following two cases, iterator would perform REST API call multiple times.
The iterator would execute the query again once the records fetched in the first cycle are consumed and so on. In order to do that, after the execution of the REST API in each cycle, the iterator updates the initially passed
startKey
to the nextStartKey. If there is no nextStartKey, it is set to theendKey
. Currently, when the nextStartKey andendKey
are the same, we still run one REST API call which is actually not needed as we always set inclusive_end=false. However, this causes an infinite loop in the following case.When we want to retrieve all the records, the user would submit an open-ended query by passing an empty string as the
startKey
and theendKey
. Refer to chaincode APIs for details -- https://github.com/hyperledger/fabric-chaincode-go/blob/master/shim/interfaces.go#L112-L114 . When thestartKey
is an empty string, the REST API call would become_all_docs?endKey="YYY"
. When both are empty, it would become_all_docs
. Given that we set thestartKey
to theendKey
when there is no nextStartKey and still execute one REST API call after that, it gets into an infinite loop by fetching all records again and again.In this PR, we avoid this infinite loop by setting
scanner.exhausted = true
when thestartKey
andendKey
become the same when there is nonextStartKey
.Additional details
Related issues
FAB-17939